DETAILED ACTION

1. Claims 1-8, 10-22, 24-28 and 41-43 have been examined. Application 
09/879,491 (METHOD AND SYSTEM FOR PREDICTING CUSTOMER BEHAVIOR 
BASED ON DATA NETWORK GEOGRAPHY) has a filing date of 06/12/2001.

REASON FOR ALLOWANCE 

2. Pursuant to the Board Decision filed 05/28/2009, the Application is allowed, as the
Board found that the prior art of Menon (US 5,537,488), Wu (US 6,741,967) and Appellant's
Background of the Invention does not teach Appellant's invention.

EXAMINER'S AMENDMENT 

3. An examiner's amendment to the record appears below. Should the changes 
and/or additions be unacceptable to applicant, an amendment may be filed as provided 
by 37 CFR 1.312. To ensure consideration of such an amendment, it MUST be 
submitted no later than the payment of the issue fee. 

Authorization for this examiner's amendment was given in a telephone interview 
with Wayne P. Bailey on 09/21/2009. 

1. (Currently Amended) A data processing machine implemented method of 
selecting data sets for use with a predictive algorithm based on data network 
geographical information, comprising data processing machine implemented steps of: 

generating, by the data processing machine, a first statistical distribution of a
training data set; 

generating, by the data processing machine, a second statistical distribution of a
testing data set;

using, by the data processing machine, the first statistical distribution and the
second statistical distribution to identify a discrepancy between the first statistical 
distribution and the second statistical distribution with respect to the data network 
geographical information by comparing at least one of the first statistical distribution and 
the second statistical distribution to a statistical distribution of a customer database to 
determine if at least one of the training data set and the testing data set are 
geographically representative of a customer population represented by the customer 
database; 

modifying, by the data processing machine, selection of entries in one or more of
the training data set and the testing data set based on the discrepancy between the first 
statistical distribution and the second statistical distribution; and 

using the modified selection of entries by the predictive algorithm. 

2. The method of claim 1, wherein the first statistical distribution and the second 
statistical distribution are distributions of a number of data network links from a 
customer data network geographical location to a web site data network geographical 
location. 

3. The method of claim 1, wherein the first statistical distribution and the second
statistical distribution are distributions of a size of a click stream for arriving at a web site 
data network geographical location. 

4. The method of claim 1, wherein comparing the first statistical distribution and 
the second statistical distribution includes comparing one or more of a mean, mode, and 
standard deviation of the first statistical distribution to one or more of a mean, mode,
and standard deviation of the second statistical distribution. 

5. The method of claim 1, wherein the first statistical distribution and the second
statistical distribution are distributions of a weighted data network geographical distance 
between a customer data network geographical location and a web site data network 
geographical locations. 

6. The method of claim 1, wherein the first statistical distribution and the second 
statistical distribution are distributions of a weighted click stream for arriving at a web 
site data network geographical locations. 

7. The method of claim 1, wherein modifying selection of entries in one or more 
of the training data set and the testing data set includes generating recommendations 
for improving selection of entries in one or more of the training data set and the testing 
data set, and wherein the method of claim 1 further comprises re-generating at least 
one of the first statistical distribution and the second statistical distribution based upon 
the recommendations. 

8. The method of claim 1, wherein the training data set and the testing data set 
are selected from a customer information database comprising information with respect 
to customers who have purchased any of goods and services over a data network, 
wherein the data network geographic information pertains to geographic information of 
the data network. 

10. The method of claim 1, wherein the first statistical distribution and second 
statistical distribution are frequency distributions of number of data network links 
between a customer geographical location and one or more web site data network
geographical locations, and size of a click stream for arriving at one or more web site 
data network geographical locations. 

11. The method of claim 1, wherein comparing at least one of the first statistical
distribution and the second statistical distribution to a statistical distribution of a 
customer database includes: generating a composite data set from the training data set 
and the testing data set; and 

generating a composite statistical distribution from the composite data set that 
was generated from the training data set and the testing data set. 

12. The method of claim 1, wherein modifying selection of entries in one or more 
of the training data set and the testing data set includes changing one of a random 
selection algorithm and a seed value for the random selection algorithm, and then re- 
comparing the first statistical distribution and the second statistical distribution. 

13. The method of claim 1, wherein using the modified selection of entries by the
predictive algorithm includes training the predictive algorithm using at least one of the 
training data set and the testing data set if the discrepancy is within a predetermined 
tolerance. 

14. The method of claim 13, wherein the predictive algorithm is a discovery 
based data mining algorithm. 

15. An apparatus for selecting data sets for use with a predictive algorithm based 
on data network geographical information, comprising: 

a statistical engine;

a comparison engine coupled to the statistical engine, wherein the statistical
engine generates a first statistical distribution of a training data set and a second 
distribution of a testing data set, the comparison engine uses the first statistical 
distribution and the second distribution to identify a discrepancy between the first 
statistical distribution and the second distribution with respect to the data network 
geographical information by comparing at least one of the first statistical distribution and 
the second statistical distribution to a statistical distribution of a customer database to 
determine if at least one of the training data set and the testing data set are 
geographically representative of a customer population represented by the customer 
database, modifies selection of entries in one or more of the training data set and the 
testing data set based on the discrepancy between the first statistical distribution and 
the second distribution, and provides the modified selection of entries for use by the 
predictive algorithm; and 

a predictive algorithm device that uses the modified selection of entries and the 
predictive algorithm. 

16. The apparatus of claim 15, wherein the first statistical distribution and the 
second statistical distribution are distributions of a number of data network links from a 
customer data network geographical location to a web site data network geographical 
location. 

17. The apparatus of claim 15, wherein the first statistical distribution and the 
second statistical distribution are distributions of a size of a click stream to arrive at a 
web site data network geographical location.

18. The apparatus of claim 15, wherein the comparison engine compares the first
statistical distribution and the second statistical distribution by comparing one or more of 
a mean, mode, and standard deviation of the first statistical distribution to one or more 
of a mean, mode, and standard deviation of the second statistical distribution. 

19. The apparatus of claim 15, wherein the first statistical distribution and the 
second statistical distribution are distributions of a weighted number of data network 
links between a customer data network geographical location and a web site data 
network geographical location. 

20. The apparatus of claim 15, wherein the first statistical distribution and the 
second statistical distribution are distributions of a weighted size of a click stream to 
arrive at a web site data network geographical location. 

21. The apparatus of claim 15, wherein the comparison engine modifies selection
of entries in one or more of the training data set and the testing data set by generating 
recommendations for improving selection of entries in one or more of the training data 
set and the testing data set, and wherein the statistical engine re-generates at least one 
of the first statistical distribution and the second statistical distribution based upon the 
recommendations. 

22. The apparatus of claim 15, further comprising a training data set/testing data 
set selection device that selects the training data set and the testing data set from a 
customer information database comprising information with respect to customers who 
have purchased any of goods and services over a data network, wherein the data 
network geographic information pertains to geographic information of the data network.

24. The apparatus of claim 15, wherein the first statistical distribution and second
statistical distribution are frequency distributions of a number of data network links 
between a customer data network geographical location and one or more web site data 
network geographical locations, and a size of a click stream to arrive at one or more 
web site data network geographical locations. 

25. The apparatus of claim 15, wherein the comparison engine compares at least 
one of the first statistical distribution and the second statistical distribution to a statistical 
distribution of a customer database by: 

generating a composite data set from the training data set and the testing data 
set; and 

generating a composite statistical distribution from the composite data set that 
was generated from the training data set and the testing data set. 

26. The apparatus of claim 15, wherein the comparison engine modifies selection 
of entries in one or more of the training data set and the testing data set by changing 
one of a random selection algorithm and a seed value for the random selection 
algorithm, and then re-comparing the first statistical distribution and the second 
statistical distribution. 

27. The apparatus of claim 15, wherein the predictive algorithm device is trained 
using at least one of the training data set and the testing data set if the discrepancy is 
within a predetermined tolerance. 

28. The apparatus of claim 27, wherein the predictive algorithm is a discovery 
based data mining algorithm.

41. A data processing machine implemented method of predicting customer
behavior based on data network geographical influences, comprising data processing 
machine implemented steps of: 

obtaining data network geographical information regarding a plurality of 
customers, the data network geographic information comprising frequency distributions 
of both (i) number of data network links between a customer geographical location and 
one or more web site data network geographical locations, and (ii) size of a click stream 
for arriving at the one or more web site data network geographical locations;

training a predictive algorithm using the data network geographical information; and

using the predictive algorithm to predict customer behavior based on the data network
geographical information.

42. An apparatus for predicting customer behavior based on data network 
geographical influences, comprising: 

means for obtaining data network geographical information regarding a plurality 
of customers, the data network geographic information comprising frequency 
distributions of both (i) number of data network links between a customer geographical 
location and one or more web site data network geographical locations, and (ii) size of a 
click stream for arriving at the one or more web site data network geographical 
locations; 

means for training a predictive algorithm using the data network geographical 
information; and

means for using the predictive algorithm to predict customer behavior based on
the data network geographical information. 

43. A computer program product in a computer readable medium comprising 
instructions for enabling a data processing machine to predict customer behavior based 
on data network geographical influences, comprising: 

first instructions for obtaining data network geographical information regarding a 
plurality of customers, the data network geographic information comprising frequency 
distributions of both (i) number of data network links between a customer geographical 
location and one or more web site data network geographical locations, and (ii) size of a 
click stream for arriving at the one or more web site data network geographical 
locations; 

second instructions for training a predictive algorithm using the data network 
geographical information; and 

third instructions for using the predictive algorithm to predict customer behavior 
based on the data network geographical information. 
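
For illustration only, the selection procedure recited in claims 1, 4, 12 and 13 can be
sketched in Python as follows. This is a minimal sketch under stated assumptions, not the
applicant's implementation and not part of the amended claims: the function names
(summarize, discrepancy, select_sets), the use of the largest absolute difference as the
discrepancy measure, and the representation of each customer as a single number (for
example, a count of data network links) are all hypothetical. Python 3.8 or later is
assumed.

```python
import random
import statistics


def summarize(values):
    """Return the mean, mode, and standard deviation of a list of numbers."""
    return {
        "mean": statistics.mean(values),
        "mode": statistics.mode(values),
        "stdev": statistics.pstdev(values),
    }


def discrepancy(a, b):
    """Largest absolute difference across the three summary statistics (assumed metric)."""
    return max(abs(a[key] - b[key]) for key in a)


def select_sets(population, train_size, test_size, tolerance, max_tries=100):
    """Redraw the training and testing sets with a new seed until both are
    within `tolerance` of each other and of the customer population."""
    target = summarize(population)
    for seed in range(max_tries):
        rng = random.Random(seed)  # changing the seed changes the selection
        sample = rng.sample(population, train_size + test_size)
        train, test = sample[:train_size], sample[train_size:]
        s_train, s_test = summarize(train), summarize(test)
        worst = max(
            discrepancy(s_train, s_test),
            discrepancy(s_train, target),
            discrepancy(s_test, target),
        )
        if worst <= tolerance:
            return seed, train, test
    raise ValueError("no seed produced sets within the tolerance")
```

A caller would pass the per-customer values from the customer database as population and
would train the predictive algorithm only on the sets that select_sets returns.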

Conclusion 

4. Any inquiry concerning this communication or earlier communications from the 
examiner should be directed to DANIEL LASTRA, whose telephone number is 571-272-6720
and whose fax number is 571-273-6720. The examiner can normally be reached on 9:30-6:00.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's 
supervisor, ROBERT A WEINHARDT can be reached on (571)272-6633. The 
official Fax number is (571) 273-8300.

Information regarding the status of an application may be obtained from the
Patent Application Information Retrieval (PAIR) system. Status information for published 
applications may be obtained from either Private PAIR or Public PAIR. Status 
information for unpublished applications is available through Private PAIR only. For 
more information about the PAIR system, see http://pair-direct.uspto.gov. Should you 
have questions on access to the Private PAIR system, contact the Electronic Business 
Center (EBC) at 866-217-9197 (toll-free). 